Modified Dirichlet Distribution: Allowing Negative Parameters to Induce Stronger Sparsity
Abstract
The Dirichlet distribution (Dir) is one of the most widely used prior distributions in statistical approaches to natural language processing. The parameters of Dir are required to be positive, which significantly limits its strength as a sparsity prior. In this paper, we propose a simple modification to the Dirichlet distribution that allows the parameters to be negative. Our modified Dirichlet distribution (mDir) not only induces much stronger sparsity, but also simultaneously performs smoothing. mDir is still conjugate to the multinomial distribution, which simplifies posterior inference. We introduce two simple and efficient algorithms for finding the mode of mDir. Our experiments on learning Gaussian mixtures and unsupervised dependency parsing demonstrate the advantage of mDir over Dir.

1 Dirichlet Distribution

The Dirichlet distribution (Dir) is defined over probability vectors x = 〈x_1, . . . , x_n〉 with positive parameter vector α = 〈α_1, . . . , α_n〉:

Dir(x; α) = \frac{\Gamma\left(\sum_{i=1}^{n} \alpha_i\right)}{\prod_{i=1}^{n} \Gamma(\alpha_i)} \prod_{i=1}^{n} x_i^{\alpha_i - 1}
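To see concretely why positivity limits Dir as a sparsity prior: by conjugacy, the posterior after observing counts c is Dir(α + c), whose mode is proportional to c_i + α_i − 1 (when all α_i + c_i > 1), so no dimension with α_i > 1 is ever driven to exactly zero. The Python sketch below contrasts this with what a negative pseudo-count would do; the clipped-and-renormalized estimate shown for mDir is an illustrative assumption, not the paper's actual mode-finding algorithm.

    import numpy as np

    def dir_map(counts, alpha):
        # MAP estimate under Dir(alpha): add alpha - 1 pseudo-counts.
        # With alpha > 1 every dimension stays strictly positive (smoothing).
        m = counts + alpha - 1.0
        return m / m.sum()

    def mdir_map_sketch(counts, alpha):
        # Illustrative sketch only: with a negative alpha, dimensions where
        # counts + alpha - 1 <= 0 are clipped to exactly zero and the rest
        # renormalized. This mimics the sparsifying effect of mDir but is
        # NOT the paper's exact algorithm.
        m = np.maximum(counts + alpha - 1.0, 0.0)
        return m / m.sum()

    counts = np.array([50.0, 3.0, 1.0, 0.0])
    print(dir_map(counts, alpha=2.0))           # all entries positive
    print(mdir_map_sketch(counts, alpha=-1.0))  # low-count dims zeroed out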
Similar Resources
Dirichlet draws are sparse with high probability
This note provides an elementary proof of the folklore fact that draws from a Dirichlet distribution (with parameters less than 1) are typically sparse (most coordinates are small).

1 Bounds

Let Dir(α) denote a Dirichlet distribution with all parameters equal to α. Theorem 1.1. Suppose n ≥ 2 and (X_1, . . . , X_n) ∼ Dir(1/n). Then, for any c_0 ≥ 1 satisfying 6c_0 ln(n) + 1 < 3n, Pr[ |{i : X_i ≥ 1…
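A quick empirical check of this statement takes a few lines of numpy (a sketch; the threshold 1/n is chosen for illustration rather than taken from the theorem's exact constants):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    # One draw from a symmetric Dirichlet with every parameter equal to 1/n.
    x = rng.dirichlet(np.full(n, 1.0 / n))
    # How many coordinates carry non-negligible mass?
    print(np.sum(x >= 1.0 / n))  # typically a small handful out of 1000

Most of the probability mass lands on very few coordinates, which is exactly the sparsity the theorem quantifies.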
Topic Models with Sparse and Group-Sparsity Inducing Priors
The quality of topic models depends strongly on the quality of the documents used. Insufficient information may result in topics that are difficult to interpret or evaluate. Including external data can help to increase the quality of topic models. We propose sparsity and grouped-sparsity inducing priors on the meta parameters of word topic probabilities in fully Bayesian Latent Dirichlet Allocation (L…
Analysis of Variational Bayesian Latent Dirichlet Allocation: Weaker Sparsity Than MAP
Latent Dirichlet allocation (LDA) is a popular generative model of various objects such as texts and images, where an object is expressed as a mixture of latent topics. In this paper, we theoretically investigate variational Bayesian (VB) learning in LDA. More specifically, we analytically derive the leading term of the VB free energy under an asymptotic setup, and show that there exist transit...
A concave regularization technique for sparse mixture models
Latent variable mixture models are a powerful tool for exploring the structure in large datasets. A common challenge in interpreting such models is the desire to impose sparsity, the natural assumption that each data point contains only a few latent features. Since mixture distributions are constrained in their L1 norm, typical sparsity techniques based on L1 regularization become toothless, and co…
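The "toothless" remark is easy to verify directly: every probability vector has L1 norm exactly 1, so an L1 penalty adds the same constant to the objective no matter how sparse the solution is (a minimal illustration, not code from the paper):

    import numpy as np

    dense = np.full(10, 0.1)                   # mass spread over 10 entries
    sparse = np.array([0.9, 0.1] + [0.0] * 8)  # mass on only 2 entries
    # Both have L1 norm 1, so a penalty lambda * ||x||_1 cannot
    # distinguish dense from sparse mixture weights.
    print(np.abs(dense).sum(), np.abs(sparse).sum())  # 1.0 1.0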
Hierarchical Compound Poisson Factorization
Non-negative matrix factorization models based on a hierarchical Gamma-Poisson structure capture user and item behavior effectively in extremely sparse data sets, making them the ideal choice for collaborative filtering applications. Hierarchical Poisson factorization (HPF) in particular has proved successful for scalable recommendation systems with extreme sparsity. HPF, however, suffers from ...
Publication date: 2016